Comparison of the Incremental Validity
of the Old and New MCAT



Abstract 



The predictive and incremental validity of both the Old and New Medical College 
Admission Test (MCAT) was examined and compared with a sample of over 300 medical 
students. Results of zero order and incremental validity coefficients, as well as 
prediction models resulting from all possible subsets regression analyses using Mallows' $C_p$
criterion, were subjected to cross-validation analyses by randomly dividing two medical
school classes into screening and calibration samples. Results supported the incremental
validity of both the Old and New MCAT. Coefficients were generally larger for the New
than for the Old MCAT. Prediction models of NBME Part I and II performance, comprised
of the New Biology and Chemistry subtests and the Old Science and General Information
subtests were cross-validated. Prediction models of clinical evaluation clerkship 
performance were equivocal.

Incremental Validity 1

A number of studies recently have compared the ability of the Old and the New
MCAT to predict student achievement in medical school. Since the New MCAT was first 
used in 1978, most studies have focused on the cognitive outcomes of student performance 
in medical school during the basic science curriculum. As students admitted on the basis 
of their New MCAT Scores, along with other admission criteria, progress through the 
medical curriculum, the contribution of the New MCAT in predicting other outcome
measures such as clinical performance is beginning to be studied (e.g., Carline, Cullen &
Scott, 1982; Carline et al, 1983; Hull, Calhoun & Maxim, 1981). Findings presented in the
Association of American Medical Colleges' New Medical College Admission Test
Interpretive Manual (1977), as well as the results from local analyses, indicate that the
test has strong predictive validity for cognitive outcomes during the first and second
years in medical school. McGuire (1980) investigated the relationship of the New MCAT
to the criterion of class standing at the end of the freshman year. Results of a 
correlation analysis revealed that all of the New MCAT subscales except for Skills 
Analysis: Reading correlated significantly with class standing (p < .001). Similar results
were also found for undergraduate GPA and undergraduate science GPA. Among the 
group of predictors used to create a revised admissions prediction index, maximum 
predictability was achieved by undergraduate GPA and the New MCAT Science Problems 
score, which is calculated from problem-solving items involved in the biology, chemistry 
and physics components. As Jones and Thomae-Forgues (1981) point out, the predictive 
ability of the New MCAT in relation to performance in the basic medical sciences is not 
surprising in view of the heavy emphasis placed in development on the science and 
medicine-related content relevance of test items. Performance in medical school has 
long been considered to be related to science achievement; therefore, those who succeed 
in the first science-based academic quarter have little trouble completing their medical 
studies (Cullen et al, 1980). Furthermore, Dawson-Saunders and Doolen (1981) and Jones
and Thomae-Forgues (1981) discussed the New MCAT's potential value as a predictor of
clinical performance. Due to the increased emphasis on interpretation and problem
solving in the new format, they suggested that the New MCAT may result in measures
which are more closely associated with the information gathering, evaluation, and 
utilization skills required during the clinical experience. 

Hull et al (1981) found validity coefficients of the New MCAT subtests with NBME 
Part I scores (basic science) to be consistently larger than the validity coefficients of 
undergraduate grades with NBME I. Carline et al (1983) also found the New MCAT 
subtests to be superior to grades in validity coefficients for NBME Part II scores (clinical 
science). In reviewing the success of the New MCAT in predicting medical school
performance, Jones and Thomae-Forgues (1982) noted several patterns in the American
Medical College Application Service data set. Among these patterns were the MCAT's
ability to predict NBME Part I performance better than undergraduate college grades and
that "predictions of medical school course grade performance based on MCAT scores and 
undergraduate college grades are better than those based on either one alone" (p.6). 

Friedman and his colleagues (1980, 1981) have used the method of incremental validity
(Sechrest, 1963) to illustrate the utility of both New and Old MCAT scores in improving
the amount of variance accounted for in measures of medical student performance beyond
that accounted for by other preadmission measures. They found that the explained
variance in examination performance during the first two years was proportionately 
greater when the New MCAT was used in place of the Old MCAT in a stepwise regression 
analysis after all other admissions variables were included. Their analyses also revealed 
that the New MCAT's incremental predictive power was higher for nationally standardized 
outcome measures as compared to that for locally prepared achievement tests. When 
NBME Part I examination scores on the Microbiology and Anatomy subtests were used as 
criterion measures, the most valuable New MCAT predictors were the Biology, Science
Problems, and Skills Analysis: Quantitative assessment subtests. Erdmann (1980)
characterized the results of the "first round" of MCAT studies as encouraging, and called
for the next phase to focus on "the range of relationships between test scores and various
criterion measures, the consistency of these findings over time, and the pattern of
relationships with performance measures obtained at successive stages of the medical
education process" (p. 464).

Among the preadmission measures included as predictor (independent) variables in 
the two incremental validity studies previously reported were undergraduate grade-point 
average, selectivity of undergraduate school, marital status, age, "quantitativeness" of 
undergraduate major, parental education and income, and hometown community size, as 
well as MCAT subtest scores (Friedman & Bakewell, 1980; Friedman & Porter, 1981).
Dependent (criterion) variables included a composite first year medical school 
examination score and two NBME Part I subtest scores (Microbiology and Anatomy). 
While many of the non-MCAT preadmission measures were significant predictors of, and 
explained additional significant variance in, these criterion measures in multiple
regression analyses, some of these measures (e.g., marital status, parental education and 
income, hometown community size) are of questionable utility for making admission 
decisions, given their sensitivity and potential legal implications. However, the results 
reported by Friedman and colleagues are encouraging because the inclusion of MCAT 
subtest scores explained additional significant variance in medical student performance 
even when non-traditional admission measures were included as predictors. It is likely, 
therefore, that their results were conservative because the additional non-traditional
predictor variables reduced the amount of potential incremental variance available to be 
accounted for by the MCAT subscores. 

The purpose of the present study was to examine the ability of the New Medical
College Admission Test (MCAT) to predict medical students' performance on measures of
both basic and clinical science, and to compare the New MCAT's performance with that of
the Old MCAT. The specific research questions included: (1) do the Old and New MCAT
scores correlate positively and significantly with (a) basic science performance as
measured by Part I of the National Board of Medical Examiners' examination (NBME Part
I), (b) clinical science performance as measured by NBME Part II, (c) house officer ratings
of students' clinical problem solving skills, and (d) house officer ratings of students'
clinical interpersonal skills?; (2) do the Old and New MCAT scores explain significant
incremental (predictive) variance in these four performance measures beyond that
accounted for by other preadmission variables?; and (3) is the predictive power of the New
MCAT superior to that of the Old MCAT?

There are several distinctions between the present study and previous studies. First, 
only more traditional preadmission measures that are routinely used in making admission 
decisions were included as predictor variables in addition to MCAT scores. Second, 
criterion measures included faculty ratings of student clinical performance and student 
NBME Part II (clinical science), as well as Part I (basic science), total scores. The clinical 
performance and NBME Part II measures represent outcome measures in New MCAT 
validity studies that have just begun to be examined (Carline et al, 1983; Hull, Calhoun & 
Maxim, 1981). Additionally, there are methodological differences between this study and 
those previously reported in the literature. These differences are discussed in the 
following section and focus primarily on the type of regression analyses performed, cross- 
validation procedures, and the exclusion of the New MCAT Science Problems subtest in 
multivariate analyses. 

Methodology 

Instrumentation and Sampling 

Preadmission measures, clinical clerkship evaluation ratings, and NBME Part I and II 
total scores were obtained for persons who entered the four year curriculum at The 
University of Michigan Medical School in 1977 and 1978. These will be referred to as the
classes of 1981 and 1982, respectively, their year of graduation. The preadmission
variables included undergraduate science and nonscience GPA, MCAT scores for each
component of the Old MCAT for the class of 1981 and New MCAT for the class of 1982,
and a mean interview rating assigned by the medical school faculty members who
interview applicants. These represent the quantitative preadmission predictor measures
available for students at the time of their application to medical school.

The clinical evaluation ratings and NBME scores represent the criterion medical 
school performance measures examined in the study. Clinical evaluation scores were
obtained from faculty ratings of student performance on a clinical evaluation form (CEF) 
completed for each student during their required third year clerkship in Internal Medicine. 
Although most disciplines use the CEF, Internal Medicine was selected as representative 
to control for variations among ratings attributable to clerkship disciplines. Because 
several faculty members and house staff complete CEFs for each student, one CEF was 
randomly selected for each student from all CEFs completed by house officers for that 
student during the last four weeks of his/her twelve week clerkship. House officer CEFs 
were selected because previous studies have indicated their evaluations tend to have
higher inter-rater agreement and correlate higher with NBME Part II scores than do 
faculty evaluations (Hull, 1982). The two subscores of the CEF, one representing problem 
solving skills and the other representing interpersonal skills, were used in the analyses for 
each student. An analysis of the reliability and validity of the CEF is presented elsewhere
(Dielman, Hull & Davis, 1980).

Total sample sizes ranged between 155 and 185 subjects for analyses pertaining to 
an entire class. Subjects were randomly divided into two sub-samples, a screening sample 
and a calibration sample, in order to cross-validate the results obtained in the multiple
correlation/regression analyses (Kerlinger & Pedhazur, 1973; Lord & Novick, 1968) 
described in the following section. All data were analyzed for each sub-sample 
independently and again for the total combined sample.

Capitalization on chance in the development of a regression/prediction model based
on sample correlations is a well known problem (Lord & Novick, 1968). Because these
sample correlations reflect not only the true correlation among the variables but also
sampling error, the multiple correlation typically "shrinks" when these variables
are used on a new sample. Both Lord and Novick (1968) and Kerlinger and Pedhazur (1973)
recommend cross-validation procedures to address this problem. Cross-validation
necessitates obtaining two samples. The first sample is referred to as the screening
sample, and is used to develop the regression equation and multiple $R^2$. The predictor
variables of the second sample, referred to as the calibration sample, are then applied to
the regression equation obtained from the screening sample to obtain predicted scores for
the criterion variable. The observed criterion scores ($y$) for the calibration sample are
then correlated with the predicted criterion scores ($y'$). This Pearson $r_{yy'}$ is analogous to
a multiple correlation between the observed and predicted scores. In the present study,
this procedure was applied twice in order to allow each sub-sample to constitute the
screening (and calibration) sample. This "double cross-validation procedure is strongly
recommended as the most rigorous approach to the validation of results from regression
analysis in a predictive framework" (Kerlinger and Pedhazur, 1973, p. 284). Results of the
two regression equations, multiple $R^2$ values, and $r_{yy'}$ values obtained from alternate samples were
then compared. Analyses of the data were performed retrospectively and were not used
in making admission decisions.
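
The double cross-validation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis run in the study: the data are synthetic, the two predictors are hypothetical, and the helper names (`ols`, `pearson_r`, `double_cross_validate`) are invented for the example. Ordinary least squares is solved through the normal equations.

```python
import random


def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X
    (each row of X starts with 1.0 for the intercept), via normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def double_cross_validate(X, y, seed=0):
    """Randomly split the cases in two; each half serves once as the screening
    sample (model fitting) and once as the calibration sample (r_yy')."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    halves = (idx[:half], idx[half:])
    cross_rs = []
    for screening, calibration in (halves, halves[::-1]):
        beta = ols([X[i] for i in screening], [y[i] for i in screening])
        predicted = [sum(b * v for b, v in zip(beta, X[i])) for i in calibration]
        observed = [y[i] for i in calibration]
        cross_rs.append(pearson_r(observed, predicted))
    return cross_rs


# Synthetic, hypothetical data: intercept column plus two predictors.
X = [[1.0, float(i), float((i * 3) % 7)] for i in range(40)]
y = [2.0 * row[1] + 3.0 * row[2] + ((i * 7) % 5 - 2) for i, row in enumerate(X)]
r_first, r_second = double_cross_validate(X, y)
```

Each returned value is the correlation between observed and predicted criterion scores in one calibration half; comparing the two values with the screening-sample multiple R gives a rough sense of shrinkage.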
Correlational and Incremental Validity Analyses 

Pearson zero order correlations were computed to test the research hypotheses of a
significant positive relationship between each of the MCAT subscores and the four
criterion performance measures. Incremental validity (Sechrest, 1963) was examined by
using a step-wise, hierarchical multiple regression analysis design involving a two step
procedure. In the first phase, all non-MCAT preadmission variables were simultaneously
included in the analysis. Only after all these non-MCAT variables were included were the
MCAT subscores simultaneously stepped in the second phase of the analysis. Four
separate analyses were performed, one for each of the criterion measures. These analyses
permitted an examination of the usefulness of the MCAT subtests in explaining additional
variance in the criterion measures beyond that already explained by the non-MCAT
admission variables.

Three separate indices of MCAT incremental validity were calculated. The first
index indicates the absolute amount of variance (as measured by multiple $R^2$) explained
for each of the four criterion measures by the MCAT subtest scores when they are
stepped into the multiple regression analysis after all the non-MCAT preadmission
measures have been included (Sechrest, 1963). This index was determined using formula 1.

$$\text{Index 1} = R^2_{\text{all variables}} - R^2_{\text{non-MCAT variables}} = R^2 \text{ added by MCAT} \tag{1}$$

The second index provides a measure of the proportional increase in performance
variance explained by stepping in MCAT scores last in the regression analysis and was
calculated using formula 2 below.

$$\text{Index 2} = \frac{R^2 \text{ added by MCAT}}{R^2_{\text{non-MCAT variables}}} \tag{2}$$

The third index provides a measure of the proportional increase in performance
variance that is unaccounted for by the non-MCAT measures and that is explained by
adding the MCAT scores to the regression analysis. This index was calculated using
formula 3 below. Friedman and Porter (1981) argued for the inclusion of both of these
latter two indexes in order to minimize artifactual differences in incremental validity
between the two MCAT versions due to varying amounts of non-MCAT explained variance
remaining for differing medical school classes.

$$\text{Index 3} = \frac{R^2 \text{ added by MCAT}}{1 - R^2_{\text{non-MCAT variables}}} \tag{3}$$
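
As a worked illustration of formulas 1 to 3, the short Python sketch below computes the three indices from two multiple R-squared values. The function name `incremental_validity` and the input values (.16 for all variables, .05 for the non-MCAT variables) are assumptions chosen only to make the arithmetic concrete.

```python
def incremental_validity(r2_all, r2_non_mcat):
    """Return the three incremental validity indices.

    r2_all      -- multiple R^2 with all preadmission variables entered
    r2_non_mcat -- multiple R^2 with only the non-MCAT variables entered
    """
    added = r2_all - r2_non_mcat                # formula 1: R^2 added by MCAT
    return {
        "index1": added,
        "index2": added / r2_non_mcat,          # formula 2: relative to non-MCAT R^2
        "index3": added / (1.0 - r2_non_mcat),  # formula 3: share of remaining variance
    }


# Hypothetical inputs for illustration only.
result = incremental_validity(r2_all=0.16, r2_non_mcat=0.05)
```

With these inputs, index 1 is about .11, index 2 about 2.2, and index 3 about .12, which shows how a modest absolute gain can look large relative to a small non-MCAT baseline.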

Because the scores on the New MCAT Science Problems subtest are derived from a
subset of the items that comprise three other New MCAT subtests, Biology, Chemistry,
and Physics, this subtest is by definition linearly dependent upon these other subtests.
Thus, while "scores on the six New MCAT areas of assessment are designed to be
relatively independent and are purposefully reported separately .... items from the
Science Problems subtest contribute twice to New MCAT scores" (New MCAT
Interpretive Manual, 1977). This issue has been addressed in several New MCAT validity
studies (Hull, Calhoun & Maxim, 1981; Jones & Thomae-Forgues, 1981) by excluding the
Science Problems subtest from multivariate analyses, while it has been included in most
other studies (e.g., Carline, Cullen & Scott, 1982; Carline et al, 1983; Friedman &
Bakewell, 1980; Friedman & Porter, 1981; Holley, 1981; McGuire, 1980; Molidor & Elstein,
1979; Molidor, Elstein & Scheifley, 1980). Psychometrically the problem is that the
Science Problems subtest partakes of the same error component as the other subtests,
violating the assumption of uncorrelated error variance, raising serious interpretative
questions in multivariate analyses such as factor analysis (Gorsuch, 1974). When
independent variables such as these are highly correlated in multiple regression analyses,
"not only do the estimated regression coefficients tend to be quite imprecise, but the true
regression coefficients tend to lose their meaning" (Neter & Wasserman, 1974). On the
other hand, multicollinear variables have been included in the same analyses when strong
rationale for their inclusion has been given. It is likely that the Science Problems subtest
has been included in prediction equations used to make admission decisions at
many medical schools. A discussion of the incremental and predictive validity and
usefulness of the New MCAT Science Problems subtest in predicting medical student basic
and clinical science performance is presented elsewhere (Wolf, Calhoun, Maxim & Davis,
1983).

All Possible Subsets Regression Analyses

All possible subsets regression analyses (Frane, 1981) including the five New and
four Old MCAT subtests are reported for each of the criterion measures. "The only way
to be sure of obtaining the best n of N predictors would be to determine the multiple
correlation for every such set" by using an exhaustive procedure (Lord & Novick, 1968, p.
288). Until recently the economic cost of performing such analyses was prohibitive.
However, "one major advance of the past decade in multiple regression has been the
replacement of stepwise procedures with all possible subset searches for model selection,
served by the $C_p$ plot" (Wainer & Thissen, 1981, p. 313). Use of the Furnival-Wilson (1974)
algorithm enables the identification of "subsets while computing only a small fraction of
all possible regressions. Computer costs are comparable for stepwise regression for up to
about 25 independent variables" (Frane, 1981, p. 264). For a discussion of some of the
problems and issues related to stepwise procedures, see Cohen and Cohen (1975).
Virtually all the studies encountered in the MCAT literature have used stepwise
procedures in regression analyses, another distinction from the present study.

Mallows' $C_p$ was the criterion used to identify the best subsets. The "best" subset is
selected on the basis of an analysis of residuals that minimizes $C_p$ based on the following
formula (Daniel & Wood, 1971; Frane, 1981):

$$C_p = \frac{\mathrm{RSS}}{s^2} - (N - 2p') \tag{4}$$

where

RSS = residual sum of squares for the subset of independent variables being tested

$s^2$ = residual mean square based on the regression using all independent variables

$p'$ = the number of variables in the subset, including the intercept, if any

$N$ = number of cases (sample size)

In addition, multiple $R^2$ values and adjusted $R^2$ values based on formula 5 were calculated.

$$\text{Adjusted } R^2 = R^2 - \frac{p\,(1 - R^2)}{N - p'} \tag{5}$$

where $p$ = the number of independent variables when the intercept is set to zero.

These analyses enabled an examination of which potential preadmission measures, the
MCAT subtests and/or non-MCAT measures, were included in the "best" regression model
for each criterion.
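
Formulas 4 and 5 can be illustrated with a small brute-force search in Python. This sketch enumerates every subset directly rather than using the Furnival-Wilson algorithm, so it is practical only for a handful of predictors. The data are synthetic, the predictor names are borrowed from the subtests purely as labels, and the helpers `ols`, `residual_ss`, and `all_subsets` are hypothetical names for this example.

```python
from itertools import combinations


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def residual_ss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta = ols(X, y)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def all_subsets(predictors, y):
    """Return (C_p, subset, R^2, adjusted R^2) for every non-empty subset,
    sorted so that the smallest C_p comes first."""
    n = len(y)
    names = list(predictors)

    def design(subset):
        # Intercept column followed by the chosen predictors.
        return [[1.0] + [predictors[name][i] for name in subset] for i in range(n)]

    mean_y = sum(y) / n
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    # s^2: residual mean square from the model with all predictors.
    s2 = residual_ss(design(names), y) / (n - (len(names) + 1))
    results = []
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            p_prime = k + 1                                  # predictors + intercept
            rss = residual_ss(design(subset), y)
            r2 = 1.0 - rss / total_ss
            cp = rss / s2 - (n - 2 * p_prime)                # formula 4
            adj_r2 = r2 - k * (1.0 - r2) / (n - p_prime)     # formula 5
            results.append((cp, subset, r2, adj_r2))
    return sorted(results)


# Synthetic, hypothetical data: the criterion depends on two of three predictors.
n_cases = 45
predictors = {
    "biology": [float(i % 11) for i in range(n_cases)],
    "chemistry": [float((i * 3) % 7) for i in range(n_cases)],
    "reading": [float((i * 5) % 9) for i in range(n_cases)],
}
y = [3.0 * predictors["biology"][i] + 2.0 * predictors["chemistry"][i]
     + ((i * 7) % 5 - 2) for i in range(n_cases)]
ranked = all_subsets(predictors, y)
best_cp, best_subset, best_r2, best_adj_r2 = ranked[0]
```

One useful check on formula 4 is that the model containing every predictor always has $C_p$ equal to its own $p'$, because its residual sum of squares divided by $s^2$ is exactly $N - p'$.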

Results and Discussion: New MCAT

Results are presented separately for the New and Old MCAT, respectively, before
similarities and differences are summarized. Validity coefficients (i.e., correlations)
among non-MCAT and New MCAT preadmission variables and the four criterion
performance measures are summarized in Table 1 for both subsamples. Table 2 contains
validity coefficients for all subjects (i.e., both subsamples combined). Among the
preadmission variables, all New MCAT subtests were significantly correlated with each
other in both subsamples and in the entire sample (p < .05, ranging between r = .23 and
.74), except for the Reading - Biology correlation in subsample 2 (r = .11, n.s.).
Undergraduate grades for science and non-science were significantly (p < .01) related in
both sample 1 (r = .61) and sample 2 (r = .69). Even though specific coefficients vary from
sample 1 to sample 2, in general these results are consistent with prior research and
support the validity of the interrelationships.

The findings regarding the preadmission interview rating were not replicated in the
two subsamples. In sample 1, these ratings were significantly (p < .05) associated with
both science and non-science GPAs, as well as with the MCAT Biology subtest. In sample
2, these associations were not significantly different than chance. However, these ratings
were significantly associated (p < .05) with two other MCAT subtests, Physics and
Chemistry. Thus the results of the relationships between the interview ratings and the
other preadmission measures are equivocal. The same may be said for the associations
between the grade point indexes and the New MCAT subtests. In sample 1, 11 of 12
validity coefficients were statistically significant, while only 3 of 12 were significant in
sample 2. Thus the significant associations of science GPA with MCAT Physics and
Chemistry performance and of non-science GPA with MCAT Chemistry performance were
the only associations that were replicated in both samples.

Five of the 6 validity coefficients among the criterion measures were replicated in
the two subsamples. The following associations were significant and positive in both
samples: CEF-PS with both NBME I and II and with CEF-IP, and NBME I with NBME II.
There was a consistent non-significant chance association between CEF-IP and NBME II
in the two samples. Results for validity coefficients for CEF-IP and NBME I were
inconsistent and thus ambiguous, as the coefficient was significant in sample 1 (r = .2?,
p < .05) and non-significant in sample 2 (r = .05). These results provide evidence of the
concurrent validity of the CEF-PS, NBME I, and NBME II measures.

Validity coefficients between the two CEF criterion measures and the preadmission
measures were generally disappointingly low in both subsamples, with only one coefficient
attaining statistical significance in each sample (non-science GPA with CEF-IP in sample
1 and MCAT Biology with CEF-PS in sample 2). Thus there appears to be little, if any,
zero order association between these clinical clerkship evaluations and preadmission
measures.

All 12 validity coefficients between the 6 New MCAT subtests and NBME I and II
performance were significant in sample 1, while 10 of 12 were significant in sample 2.
These findings generally support the validity of the associations between the standardized
admission measures (New MCAT) and the standardized medical school performance
measures (NBME) that are found in most studies. Moreover, consistent with prior research
(Carline et al, 1983; Hull et al, 1981; Jones & Thomae-Forgues, 1982), these coefficients
were consistently higher than those between undergraduate GPAs and NBME performance.
The preadmission interview rating was significantly related to NBME performance in sample 2,
but not in sample 1.

In summary, the research hypotheses of a significant positive relationship between
each of the MCAT subscores and the four criterion performance measures were rejected
in relation to the two clinical evaluation measures, but accepted for NBME Parts I and II
performance.

Insert Tables 1 and 2 about here

Incremental Validity Results 

For sample 1, sample 2, and both samples combined, multiple $R^2$ values indicated that all
preadmission measures accounted for 8 percent, 16 percent, and 7 percent, respectively,
of the variance in clinical problem solving evaluations (CEF-PS). New MCAT subtests
accounted for the majority of this explained variance, 5 percent, 11 percent, and 4
percent in the three samples. These latter percentages constitute the incremental validity
of the New MCAT (index 1). Results summarized in Table 3 indicate, for example, that
in sample 2 MCAT subtests explained 2.2 times more variability in CEF-PS ratings than did
non-MCAT measures (index 2), or 12 percent of the remaining variance not explained by
the non-MCAT measures (index 3). Results for the CEF-IP criterion measure were
similar to CEF-PS results, with the absolute amounts of variance accounted for being
generally small. Cohen (1977) provides indexes of effect size for multiple $R^2$ values for small (2
percent shared variance), medium (13 percent), and large (26 percent) effects.
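
The benchmarks quoted above can be applied mechanically, as in the Python sketch below. Treating the three values as lower cutoffs is an assumption made for this illustration, and the function name `cohen_r2_label` is hypothetical; the inputs are the three total percentages of CEF-PS variance reported in this paragraph.

```python
def cohen_r2_label(r2):
    """Label a multiple R^2 using the benchmarks quoted in the text
    (2, 13, and 26 percent), treated here as lower cutoffs."""
    if r2 >= 0.26:
        return "large"
    if r2 >= 0.13:
        return "medium"
    if r2 >= 0.02:
        return "small"
    return "below small"


# Total variance in CEF-PS explained in sample 1, sample 2, and combined.
labels = [cohen_r2_label(v) for v in (0.08, 0.16, 0.07)]
```

Under that reading, the sample 2 value falls in the medium band and the other two in the small band.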

Effect sizes improve considerably when NBME Parts I and II are the criteria.
Multiple $R^2$ values indicate that all preadmission measures explain between 25 percent and 47
percent of the variance in NBME I and II performance. Based on Cohen's criteria, these
may be considered large effects. Again, MCAT subtests accounted for the majority of
this explained variance, with the amount of variability additionally explained by the
MCAT ranging between 13 percent and 37 percent (index 1). These effects thus may be
considered to be medium to large in magnitude. The most dramatic effect occurred in
sample 1 where MCAT explained 7.4 times (740 percent) more variability than non-MCAT
measures in NBME Part II performance (index 2). This amounted to explaining 39 percent
of the remaining variance in NBME II performance once non-MCAT variance was removed
(index 3). In summary, these findings clearly support the incremental validity of the New
MCAT subtests in contributing to explained variance in the criterion measures.



Insert Table 3 about here 



All Possible Subsets Regression Results

These analyses were performed to examine which preadmission measures were 
included in the best regression models for predicting each criterion. Based on the 
selection criteria of minimizing the $C_p$ statistic for residuals, the following standardized
regression models for CEF-PS were obtained for subsample 1 (equation 6), subsample 2
(equation 7), and the combined sample (equation 8): 



$$\text{PS.1} = .41\,\text{GPA.O} + 6.32 \tag{6}$$

$$\text{PS.2} = .20\,\text{Rating} + .19\,\text{GPA.S} + .28\,\text{Biology} - .27\,\text{Chemistry} - 1.09 \tag{7}$$

$$\text{PS} = .17\,\text{Rating} + .17\,\text{Biology} - .12\,\text{Chemistry} - 1.34 \tag{8}$$

The following models for CEF-IP were obtained for subsample 1 (equation 9),
subsample 2 (equation 10), and the total sample (equation 11):

$$\text{IP.1} = .31\,\text{GPA.O} + 4.04 \tag{9}$$

$$\text{IP.2} = .23\,\text{Rating} + .16\,\text{Biology} - .21\,\text{Chemistry} - 0.87 \tag{10}$$

$$\text{IP} = .12\,\text{Rating} + .17\,\text{GPA.O} + .17\,\text{Biology} - 0.16\,\text{Chemistry} - 0.20 \tag{11}$$



In comparing equations 6 and 7 between the two subsamples for CEF-PS and 
equations 9 and 10 for CEF-IP, it is evident that these models are inconsistent and not 
cross-validated. This is perhaps not surprising given the small amount of variance 
accounted for in the CEF measures by all the preadmission measures, individually or in
combination.

The regression models developed for NBME performance fared somewhat better.
The models for predicting NBME Part I performance for each of the two subsamples and
the combined sample are presented in equations 12-14 below.

$$\text{NBME I.1} = .49\,\text{Biology} + .26\,\text{Chemistry} + 1.18 \tag{12}$$

$$\text{NBME I.2} = .25\,\text{Biology} + .24\,\text{Chemistry} + .17\,\text{Rating} - 5.86 \tag{13}$$

$$\text{NBME I} = .36\,\text{Biology} + .26\,\text{Chemistry} + .10\,\text{Reading} + .12\,\text{Rating} - 0.08 \tag{14}$$

In examining equations 12 and 13, it is clear that both the New MCAT Biology and
Chemistry subtests are good predictors and should be included in the model for NBME I.
The result for the preadmission interview rating is ambiguous, as it was not validated in
subsample 1. The model represented in equation 14 for all subjects was selected on the
basis of having the smallest $C_p$ value (5.13), and should provide a more stable regression
model than either of the subsample models (Kerlinger & Pedhazur, 1973; Mosier, 1951).
However, the model comprised of just Biology and Chemistry resulted in a $C_p$ value of
7.03. Combined with Frane's (1981) recommendation that only independent variables
whose coefficients are significantly different from zero be retained, it may be unlikely
that adding either the Reading subtest (the beta coefficient of .10 was not significant, p < .12) or
the interview Rating (beta = .12, p < .06) would result in predictions substantially different from
excluding them from the model. This issue clearly necessitates further examination, as
the beta coefficient for Rating approached statistical significance.

Models resulting for NBME Part II also contained similarities and differences, as
evidenced by equations 15-17.

$$\text{NBME II.1} = .41\,\text{Biology} + .24\,\text{Reading} + .20\,\text{Quantitative} - 0.35 \tag{15}$$

$$\text{NBME II.2} = .30\,\text{Biology} + .26\,\text{Chemistry} + 2.38 \tag{16}$$

$$\text{NBME II} = .38\,\text{Biology} + .20\,\text{Reading} + .16\,\text{Quantitative} + .12\,\text{Rating} - 5.92 \tag{17}$$

Clearly the New MCAT Biology subtest is a component of the model for NBME II.
Results for the MCAT Reading, Quantitative, and Chemistry subtests, and for interview
ratings are not validated and remain equivocal.

Table 4 summarizes the C_p, multiple R², adjusted R², and r_yy' values for the best
subset regression models reported above. The r_yy' coefficient of .65 for NBME I for
sample 1 was obtained by correlating sample 1 (calibration sample) subjects' observed
scores with their predicted scores based on the model derived with sample 2 (screening
sample). In general, squaring the r_yy' coefficients from each sample and comparing them
with the multiple R² or adjusted R² coefficients from the same sample indicates striking
similarity and consistency for both NBME measures. The difference between multiple R²s
for the two samples, as well as the difference between r_yy' coefficients, provides an
estimate of the amount of shrinkage of the multiple correlation. In general, shrinkage
decreases as sample sizes increase (Kerlinger & Pedhazur, 1973). Even though the ratio of
subjects to the number of independent variables was approximately 9 or 10:1 for the two
subsamples, these samples may still be considered relatively small for the types of
analyses performed. As data become available for the graduating class of 1983, it would
be useful to replicate these analyses with the entire classes of 1982 and 1983 representing
the two samples, in contrast to dividing the class of 1982 into two subsamples as reported
here.
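
The cross-validation logic described above can be illustrated with a minimal sketch. It is not the authors' program: the data are synthetic, the function names (`fit_ols`, `cross_validated_r`) are hypothetical, and ordinary least squares with NumPy stands in for the BMDP routines used in the study. The sketch derives a model in one subsample, applies its weights to the other subsample to obtain r_yy', and compares the squared r_yy' with the R² of the model fitted within that same subsample.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply intercept-first coefficients to a predictor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(X, y):
    """Multiple R^2 of a model fitted and evaluated in the same sample."""
    resid = y - predict(fit_ols(X, y), X)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def cross_validated_r(X_derive, y_derive, X_apply, y_apply):
    """r_yy': correlation of observed scores in one sample with scores
    predicted from weights derived in the other sample."""
    coef = fit_ols(X_derive, y_derive)
    return np.corrcoef(y_apply, predict(coef, X_apply))[0, 1]

# Synthetic stand-in for one class split at random into two subsamples.
rng = np.random.default_rng(0)
n, k = 180, 3
X = rng.normal(size=(n, k))
y = X @ np.array([0.5, 0.3, 0.0]) + rng.normal(size=n)
half = n // 2
X1, y1, X2, y2 = X[:half], y[:half], X[half:], y[half:]

r_cv = cross_validated_r(X2, y2, X1, y1)  # weights from sample 2, applied to sample 1
r2_own = r_squared(X1, y1)                # model fitted within sample 1
print(f"R^2 in sample 1: {r2_own:.3f}; squared r_yy': {r_cv ** 2:.3f}")
```

Because least squares maximizes the within-sample correlation between observed and predicted scores, the squared r_yy' cannot exceed the R² fitted in the same sample; the gap between the two is one way of expressing shrinkage.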



Insert Table 4 about here



Results and Discussion: Old MCAT

Validity coefficients among the non-MCAT and Old MCAT preadmission variables
and the four criterion performance measures are summarized in Table 5 for both
subsamples. Table 6 contains validity coefficients for both subsamples combined. Among
the preadmission variables, all Old MCAT subtests were significantly correlated (p < .05)
with each other in both subsamples and in the combined sample (r = .18 to .59), with the
exception of the correlation between General Information and Quantitative for subsample
2 (r = .10, n.s.). The only preadmission measure significantly associated with the
preadmission interview ratings was the Old MCAT Quantitative subtest in both sample 1
(r = .27, p < .01) and sample 2 (r = .23, p < .05). Undergraduate science grade point
averages were significantly associated with the Quantitative and Science subtests in both
samples (r = .46 to .58; p < .01), and with the Verbal subtest in sample 1 only. Non-science
grades (GPA-O) correlated significantly with the Old MCAT Science subtest in sample 1
only. GPA-O was consistently not related to the Verbal, Quantitative, and General Information
subtests in both samples.
Four of the six validity coefficients among the criterion measures were replicated in
the two subsamples: there were significant positive associations between CEF-PS and
CEF-IP, NBME I and NBME II, and CEF-PS and NBME II. CEF-IP and NBME II were
consistently not related except by chance. The associations between NBME I and the two
CEF measures were significant in sample 2, but not in sample 1, making these
relationships equivocal.

Validity coefficients between the two CEF measures and the preadmission measures
were again disappointingly low, as only the correlation between CEF-IP and the
preadmission interview rating was significant (r = -.27; p < .05). However, the
relationship was opposite to the direction predicted and washed out when the two
subsamples were combined in Table 6. Consistent with results previously presented in the
summary of the New MCAT, there is little, if any, zero order association between the
clinical clerkship evaluations and preadmission measures.

In contrast to the above results for CEF, 10 of 14 correlations in sample 1 and 9 of
14 correlations in sample 2 between NBME and preadmission measures were significant
(p < .05). Cross-validated positive and significant associations included: NBME I with
GPA-S, Old MCAT Verbal, General Information, and Science; NBME II with Old MCAT
Verbal and General Information. Equivocal results were found for NBME I
with Rating (r = .26, p < .01 in sample 2; r = .10 in sample 1), GPA-O (r = .22, p < .05 in
sample 1; r = .11 in sample 2), and Old MCAT Quantitative (r = .26, p < .05 in sample 1; r =
.20, p < .06 in sample 2); for NBME II with Rating (r = .25, p < .05 in sample 2; r = .11 in
sample 1), GPA-S (r = .28, p < .05 in sample 2; r = .22, p < .06 in sample 1), GPA-O (r =
.24, p < .05 in sample 1; r = .10 in sample 2), and Old MCAT Science (r = .45, p < .01 in
sample 1; r = .15 in sample 2). Consistent chance relationships were found between Old
MCAT Quantitative and NBME II in both subsamples.
Insert Tables 5 and 6 about here



Incremental Validity Results 

Results summarized in Table 7 indicate that little variance in CEF performance was
explained by the preadmission measures. Multiple R²s including all predictors accounted
for 4-6 percent of the variance in CEF-PS, and 4-15 percent in CEF-IP. In contrast to
results in Table 3 for the New MCAT summary, Old MCAT subtests generally accounted
for the same or less variance in CEF performance than did the non-MCAT measures.

Again, effect sizes improved considerably when NBME performance served as the
criterion, with Old MCAT subtests contributing substantially to the amount of variance
accounted for in both NBME Parts I and II beyond that accounted for by non-MCAT
measures. R² added by the Old MCAT ranged between .10 and .25, which connotes medium
to large effects based on Cohen's (1977) criteria.
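
As a worked illustration of the three incremental validity indexes, the sketch below takes the NBME I, sample 1 entries of Table 7 (R² for non-MCAT measures = .16; total R² = .41). The formulas for indexes (2) and (3) are not restated in this section, so the sketch rests on an assumption: index (2) is treated as the ratio of the R² added to the non-MCAT R², and index (3) as the R² added divided by the variance left unexplained by the non-MCAT measures. These assumed forms reproduce the tabled values of 1.56 and .30, but they are an inference from the table and not a statement of the authors' computations. The function name is hypothetical.

```python
def incremental_validity(r2_non_mcat, r2_total):
    """Return the three indexes under the assumed definitions.

    (1) R^2 added by the MCAT beyond the non-MCAT measures
    (2) assumed: R^2 added divided by the non-MCAT R^2
    (3) assumed: R^2 added divided by (1 - non-MCAT R^2)
    """
    added = r2_total - r2_non_mcat
    ratio = added / r2_non_mcat
    share_of_remaining = added / (1.0 - r2_non_mcat)
    return added, ratio, share_of_remaining

# NBME I, sample 1, Old MCAT (Table 7): non-MCAT R^2 = .16, total R^2 = .41
added, ratio, share = incremental_validity(0.16, 0.41)
print(f"(1) {added:.2f}  (2) {ratio:.2f}  (3) {share:.2f}")
```

Under these assumptions, index (3) expresses the MCAT contribution relative to the variance still available to be explained, which is why it can be large even when index (1) is modest.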



Insert Table 7 about here 



All Possible Subsets Regression Results

Based on the criterion of minimizing Mallows' C_p residual statistic, the following
regression models resulted for CEF-PS for subsample 1 (equation 18), subsample 2
(equation 19), and the combined sample (equation 20):



PS.1 = .13 GPA.S + 6.55 (18)

PS.2 = .15 GPA.S + 8.37 (19)

PS = .13 GPA.S + 7.20 (20)
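
The selection rule can be sketched as follows, assuming the standard textbook form of the statistic, C_p = SSE_p / MSE_full - (n - 2p), where p counts the fitted coefficients including the intercept. The study used the BMDP all possible subsets program, whose internal details may differ; the data, variable names, and function names here are hypothetical and serve only to show the technique.

```python
from itertools import combinations
import numpy as np

def sse(X, y):
    """Residual sum of squares for an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def all_subsets_cp(X, y):
    """Mallows' C_p for every non-empty predictor subset, smallest first."""
    n, k = X.shape
    mse_full = sse(X, y) / (n - k - 1)      # error variance from the full model
    results = []
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            p = size + 1                    # coefficients including intercept
            cp = sse(X[:, list(subset)], y) / mse_full - (n - 2 * p)
            results.append((cp, subset))
    return sorted(results, key=lambda item: item[0])

# Synthetic example: only predictors 0 and 1 carry signal.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(90, 5))
y_demo = 2.0 * X_demo[:, 0] + 1.5 * X_demo[:, 1] + rng.normal(size=90)

ranked = all_subsets_cp(X_demo, y_demo)
best_cp, best_subset = ranked[0]
print("best subset:", best_subset, "C_p = %.2f" % best_cp)
```

By construction the full model always has C_p equal to its number of coefficients, so subsets with C_p near or below p are the ones with little apparent bias. Values below zero, such as several reported in the tables, are possible under this form of the statistic.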

Undergraduate science grade point average consistently emerges in the best model
for predicting CEF-PS in both subsamples and the combined sample. Equations 21-23
summarize the best models for CEF-IP. These results are ambiguous, given the different
models and the inconsistency in the sign of the preadmission interview rating in equations
21 and 22.

IP.1 = .21 GPA.O - .30 Rating + 20.52 (21)

IP.2 = .15 Rating + 1.69 (22)

IP = .15 GPA.O + 6.05 (23)

Just as in the New MCAT analyses, the regression models developed for NBME
performance fared fairly well when compared across subsamples. Models for
predicting NBME Part I performance are presented in equations 24-26.

NBME I.1 = .14 GPA.S + .56 Science - 2.71 (24)
NBME I.2 = .36 GPA.S + .30 Science + .20 General + .19 Rating - 13.75 (25)
NBME I = .23 GPA.S + .43 Science + .12 General + .10 Rating - 8.24 (26)

In examining equations 24 and 25, it is clear that both science GPA and the Old
MCAT Science subtest are good predictors and should be included in the model for NBME
I. Results for the Old MCAT General Information subtest and the preadmission interview
Rating are equivocal. Models obtained for NBME II for subsample 1 (equation 27),
subsample 2 (equation 28), and the combined sample (equation 29) are presented below:

NBME II.1 = .15 GPA.O + .35 Science + .21 General - 2.85 (27)

NBME II.2 = .28 GPA.S + .30 General + .16 Rating - 8.52 (28)
NBME II = .20 GPA.S + .29 General + .19 Science - .17 Quantitative + .17 Rating - 8.02 (29)
In comparing equations 27 and 28 it can be seen that Old MCAT General Information
should be included in the prediction model for NBME II. Results for GPA-O, GPA-S,
MCAT Science and Quantitative, and the interview Rating are equivocal. Table 8
summarizes Mallows' C_p, Multiple R², Adjusted Multiple R², and cross-validated
composite correlations (r_yy') for the best subset regression analyses using the Old MCAT.
This table is analogous to Table 4 for the New MCAT and is interpreted similarly.



Insert Table 8 about here 




Comparisons Between Old and New MCAT 
Comparing the incremental validity results for the New MCAT summarized in Table 
3 with results for the Old MCAT in Table 7, it is evident that the New MCAT explains a 
larger proportion of incremental variability in each of the four criterion measures than 
does the Old MCAT. This finding is consistent for all three indexes of incremental 
validity. This is also consistent with incremental validity results reported by Friedman 
and Porter (1981). 

Both the Old and New MCATs accounted for additional unique predictive variance
in all four outcome measures when they were entered in the incremental validity analyses
after the non-MCAT measures. Consistent with the zero order
correlational analyses, these increments were significant in explaining additional NBME
Part I and II variance, but non-significant for CEF-PS and CEF-IP variance.

In general, the incremental validity indexes reported in this study tended to be 
larger than those reported by Friedman and Bakewell (1980) and Friedman and Porter 
(1981). There were more non-MCAT measures included in their analyses, which not 
surprisingly accounted for additional variance in their criterion measures. This, therefore,
left less remaining variance in their criteria for the MCAT to potentially account for than
in the present study. Thus it is not surprising that the incremental validity indexes were
larger in the present study.

Conclusions 

Several inferences may be made from these results. MCAT subscores do account for
predictive and incremental validity in medical school performance measures. This is
particularly true for standardized measures of both basic and clinical science performance
measured by the NBME examinations. This is less true for house staff clinical evaluation
ratings. Several factors may militate against this latter relationship, including the
difference in format between the measures; the MCAT uses a multiple choice format, while
clinical evaluations are necessarily based on supervisor judgments indicated on
behaviorally anchored rating scales. Restriction of range in performance, homogeneity of
the sample, and low reliability of the measures could partially account for these weak
relationships. Clearly, homogeneity of the sample and restricted range of performance on
these measures are tenable explanations as a result of high admission standards and the
generally high level of student performance. As Carline and his colleagues (1982, p. 208)
have pointed out, "intervention of three years of study between entrance into medical
school and completion of basic clinical training must act to decrease correlations between
measurements. An additional limit on correlations is the inherent restriction of range in
preselection variables; only higher-scoring students are admitted to medical training".
Thus "the lack of large correlation coefficients does not offer significant evidence to
negate the utility of the MCAT as an aid in the selection of academically able students"
(Carline et al, 1983, p. 25). Low reliability is not as tenable an explanation because of the
acceptable reliabilities of both the MCAT and CEF (New MCAT Interpretative Manual,
1977; Dielman et al, 1980). Additionally, the fact that the clinical ratings did correlate
significantly with the NBME examinations in this study does provide support for their
validity.
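
The attenuating effect of restriction of range can be shown with a small simulation. This is an illustrative sketch only: the population correlation of .50, the admission of the top 30 percent of applicants, and the function name are arbitrary assumptions and do not describe the present sample.

```python
import numpy as np

def restriction_of_range_demo(rho=0.5, admit_fraction=0.30, n=20000, seed=42):
    """Correlate a predictor and a criterion in a full applicant pool and
    again among only those 'admitted' on the basis of the predictor."""
    rng = np.random.default_rng(seed)
    predictor = rng.normal(size=n)
    # Criterion built to correlate rho with the predictor in the full pool.
    criterion = rho * predictor + np.sqrt(1.0 - rho ** 2) * rng.normal(size=n)
    r_full = np.corrcoef(predictor, criterion)[0, 1]

    cutoff = np.quantile(predictor, 1.0 - admit_fraction)
    admitted = predictor >= cutoff          # only higher-scoring applicants
    r_restricted = np.corrcoef(predictor[admitted], criterion[admitted])[0, 1]
    return r_full, r_restricted

r_full, r_restricted = restriction_of_range_demo()
print(f"full pool r = {r_full:.2f}; admitted only r = {r_restricted:.2f}")
```

The correlation among the admitted group is noticeably smaller than in the full pool even though the underlying relationship is unchanged, which is the point made in the passage quoted from Carline and his colleagues.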

In summary, the findings of the present study support the following conclusions:

1. Both the Old and New MCAT explain additional incremental variance in 
medical school performance measures beyond that already explained by 
non-MCAT measures. This is particularly true for standardized criteria (i.e. 
NBME). 

2. The New MCAT appears to have larger incremental validity coefficients than 
the Old MCAT. 

3. The effect sizes appear to be medium to large when the criteria are National
Boards. The effect sizes are very small when the criteria are clinical
evaluation ratings.

4. Cross-validation analyses support the inclusion of the New MCAT Biology
subtest in prediction models of NBME Part I and II performance. These
analyses also support the inclusion of the New MCAT Chemistry subtest in the
prediction model for NBME Part I only. Results for the other New MCAT
subtests were equivocal.

5. Old MCAT analyses parallel New MCAT findings for NBME I in that the Old
Science subtest, replaced by the New Biology and Chemistry subtests,
cross-validated. The inclusion of the Old MCAT General Information subtest
was cross-validated in the prediction model for NBME Part II.

6. Findings for prediction models for clinical evaluation performance ratings
were equivocal and not cross-validated for either New or Old MCAT subtests.

Several methodological recommendations for future studies are suggested. These
include (1) the substitution of all possible subsets regression analyses for stepwise
procedures, (2) cross-validation of correlational/regression analyses, and (3) the exclusion
of the New MCAT Science Problems subtest in multivariate analyses. Issues related to
this third recommendation are addressed elsewhere (Wolf et al, 1983).






REFERENCES 

Bulletin of Information and Description of National Board Examinations. Philadelphia:
National Board of Medical Examiners, 1982.

Carline, J.D., Cullen, T.J. and Scott, C.S. Prediction of Clerkship Performance Using the
New MCAT Examination: An Attempted Application of Canonical Redundancy
Analysis. Proceedings of the 21st Annual Conference on Research in Medical
Education, Washington, D.C., 205-210, 1982.

Carline, J.D., Cullen, T.J., Scott, C.S., Shannon, N.F. and Schaad, D. Predicting
Performance During Clinical Years from the New Medical College Admission Test.
Journal of Medical Education, 58:18-25, 1983.

Cohen, J. Statistical Power Analysis for the Behavioral Sciences (Rev. ed.). N.Y.:
Academic Press, 1977.

Cohen, J. and Cohen, P. Applied Multiple Regression/Correlation Analysis for the
Behavioral Sciences. Hillsdale, N.J.: Erlbaum, 1975.

Cullen, T.J., Dohner, C.W., Peckham, P.D., Samson, W.E. and Schwarz, M.R. Predicting
First-Quarter Test Scores from the New Medical College Admission Test. Journal of
Medical Education, 55:393-398, 1980.

Daniel, C. and Wood, F.S. Fitting Equations to Data. N.Y.: Wiley, 1971.

Dawson-Saunders, B. and Doolen, D.R. An Alternative Method to Predict Performance:
Canonical Redundancy Analysis. Journal of Medical Education, 56:295-299, 1981.

Dielman, T.E., Hull, A.L. and Davis, W.K. Psychometric Properties of Clinical
Performance Ratings. Evaluation and the Health Professions, 3:103-117, 1980.

Erdmann, J.B. Validating the MCAT. Journal of Medical Education, 55:463-464, 1980.
(editorial)

Frane, J. All Possible Subsets Regression. In W.J. Dixon and M.B. Brown (eds.). BMDP:
Biomedical Computer Programs. Los Angeles: University of California Press,
264-277, 1981.

Friedman, C.P. and Bakewell, W.E., Jr. Incremental Validity of the New MCAT. Journal
of Medical Education, 55:399-404, 1980.

Friedman, C.P. and Porter, C.Q. Incremental Validity: The Old and New MCATs
Compared. Proceedings of the 20th Annual Conference on Research in Medical
Education, Washington, D.C., 251-256, 1981.

Furnival, G.M. and Wilson, R.W. Regression by Leaps and Bounds. Technometrics,
16:499-511, 1974.

Gorsuch, R.L. Factor Analysis. Philadelphia: W.B. Saunders, 1974.

Holley, C.D. Background Variables as Predictors of Academic Performance. Paper
presented at the Educational Conference of the American Association of Colleges of
Osteopathic Medicine, Kansas City, June 1981.




Hull, A.L. House Officer and Attending Staff as Evaluators of Medical Student
Performance. Evaluation and the Health Professions, 5:87-94, 1982.

Hull, A.L., Calhoun, J.G. and Maxim, B.R. Predicting Medical School Performance Using
[remainder of title illegible]. Proceedings of the 20th Annual Conference on Research in
Medical Education, Washington, D.C., 1981.

Jones, R.F. and Thomae-Forgues, M. A Factor Comparison of Old and New MCAT Scales.
Journal of Medical Education, 56:161-166, 1981.

Jones, R.F. and Thomae-Forgues, M. From MCAT to M.D.: Predicting Success in Medical
School. In R.L. Linn (Chair), Interpreting Data through Externally Smudged
Glasses: The Continuing Saga of Graduate and Professional School Validation
Research. Symposium presented at the meeting of the American Educational
Research Association, New York, March 1982.

Lord, F.M. and Novick, M.R. Statistical Theories of Mental Test Scores. New York:
[publisher and year illegible].

Kerlinger, F.N. and Pedhazur, E.J. Multiple Regression in Behavioral Research. N.Y.:
[publisher illegible], 1973.

McGuire, [initials and beginning of title illegible] Medical Student Performance. Journal of
Medical Education, [volume, pages, and year illegible].

Molidor, J.B. and Elstein, A.S. A Factor Analytic Study of the Old and New MCAT
Examinations. Proceedings of the 18th Annual Conference on Research in Medical
Education, Washington, D.C., 139-144, 1979.

Molidor, J.B., Elstein, A.S. and Scheifley, V. The Old and New MCAT Examinations: What
Do They Measure? Paper presented at the meeting of the American Educational
Research Association, Boston, April 1980.

Mosier, C.I. Problems and Designs of Cross-Validation. Educational and Psychological
Measurement, 11:5-11, 1951.

Neter, J. and Wasserman, W. Applied Linear Statistical Models. Homewood, Ill.: R.D.
Irwin, [year illegible].

New Medical College Admission Test Interpretative Manual. Washington, D.C.:
Association of American Medical Colleges, 1977.

Sechrest, L. Incremental Validity: A Recommendation. Educational and Psychological
Measurement, 23:153-158, 1963.

Wainer, H. and Thissen, D. Graphical Data Analysis. In M.R. Rosenzweig and L.W. Porter
(eds.). Annual Review of Psychology, 32:191-241, 1981.

Wolf, [initials and co-authors illegible], Maxim, B.R. and Davis, W.K. Predictive and Incremental
Validity of the New MCAT Science Problems Subtest. Paper presented at the meeting of
the National Council on Measurement in Education, Montreal, April 1983.






Table 1

Pearson Correlations Among Preadmission, New MCAT, Clinical Evaluation,
and NBME I, II Scores for Two Subsamples of Class of 1982
(Sample 1 above the diagonal, Sample 2 below the diagonal)

              1      2      3      4      5      6      7      8      9      10     11     12     13
 1. RATING     --    .22*   .22*   .22*  -.02   -.06    .11   -.08   -.03    .12    .07    .19    .04
 2. GPA-S     .16     --    .61**  .37**  .36**  .53**  .52**  .26**  .34**  .01    .10    .38**  .23*
 3. GPA-O     .07    .69**   --    .20*   .11    .23*   .22*   .22*   .10    .14    .31**  .18    .18
 4. MCAT-BI   .14    .07   -.09     --    .47**  .48**  .65**  .26*   .42**  .04    .12    .62**  .56**
 5. MCAT-PH   .21*   .24*   .02    .32**   --    .62**  .63**  .23*   .38**  .08     41    .47**  .37**
 6. MCAT-CH   .25    .37**  .21*   .30**  .51**   --    .74**  .33**  .44** -.01    .05    .49**  .36**
 7. MCAT-SP   [?]    .09   -.07    .58**  .62**  .58**   --    .26*   .47**  .10    .14    .55**  .43**
 8. MCAT-RE  -.02    .13    .16    .11    .26**  .28**  .32**   --    .29**  .01   -.03    .30**  .40**
 9. MCAT-QA  -.14    .20   -.04    .23*   .35**  .38**  .45**  .26*    --    .12    .04    .32**  .43**
10. CEF-PS    .19    .13    .04    .24*  -.00   -.07    .09    .03    .03     --    .72**  .27*   .32**
11. CEF-IP    .18    .07   -.05    .13   -.03   -.11    .05   -.04   -.04    .70**   --    .19    .23*
12. NBME-I    .27**  .19    .01    .35**  .29**  .35**  .38**  .17    .24*   .46**  .16     --    .81**
13. NBME-II   .24*   .12    .01    .37**  .21    .34**  .39**  .21*   .33*   .30**  .05    .78**   --

Note: RATING = preadmission interview rating; GPA = grade point average, S = science, O = other; MCAT = New version, BI = biology,
PH = physics, CH = chemistry, SP = science problems, RE = reading, QA = quantitative; CEF = clinical evaluation form,
PS = problem solving; IP = interpersonal skills. [?] = entry illegible in source.
Table 2

Pearson Correlations Among Preadmission, New MCAT, Clinical Evaluation,
and NBME I, II Scores for All Students of Class of 1982

            Rating    GPA            MCAT                                      CEF          NBME
                      S      O      BI     PH     CH     SP     RE     QA     PS     IP     I
              1      2      3      4      5      6      7      8      9      10     11     12
 1. RATING     --
 2. GPA-S     .19**   --
 3. GPA-O     .14*   .64**   --
 4. MCAT-BI   .19**  .25**  .06     --
 5. MCAT-PH   .01    .30**  .06    .40**   --
 6. MCAT-CH  -.01    .46**  .22**  .40**  .57**   --
 7. MCAT-SP   .10    .35**  .08    .63**  .62**  .67**   --
 8. MCAT-RE  -.06    .21**  .19**  .20**  .25**  .31**  .29**   --
 9. MCAT-QA  -.03    .28**  .03    .34**  .36**  .42**  .46**  .28**   --
10. CEF-PS    .16*   .08    .09*   .13    .04   -.04    .09    .02    .07     --
11. CEF-IP    .13    .07    .15*   .12    .03   -.05    .09   -.04   -.01    .71**   --
12. NBME-I    .18*   .29**  .09    .51**  .38**  .43**  .48**  .24**  .29**  .36**  .16*    --
13. NBME-II   .13    .19*   .08    .48**  .29**  .36**  .42**  .32**  .35**  .31**  .12    .80**

Note: RATING = preadmission interview rating; GPA = grade point average, S = science, O = other; MCAT = New version, BI = biology,
PH = physics, CH = chemistry, SP = science problems, RE = reading, QA = quantitative; CEF = clinical evaluation form,
PS = problem solving; IP = interpersonal skills.

*p < .05
**p < .01
Table 3

Incremental Validity for New MCAT
for Class of 1982

Criterion                                                            All
Measure     Statistic                    Sample 1    Sample 2    Subjects

CEF-PS      Sample Size (n)                  84          87         171
            R² non-MCAT                     .03         .05         .03
            R² added by MCAT (1)            .05         .11         .04
            Total R²                        .08         .16         .07
            Incremental Validity (2)       1.67        2.20        1.33
            Incremental Validity (3)        .05         .12         .04

CEF-IP      Sample Size (n)                  84          87         171
            R² non-MCAT                     .10         .04         .04
            R² added by MCAT (1)            .04         .08         .04
            Total R²                        .14         .12         .08
            Incremental Validity (2)        .40        2.00        1.00
            Incremental Validity (3)        .04         .08         .02

NBME I      Sample Size (n)                  92          93         185
            R² non-MCAT                     .16         .12         .12
            R² added by MCAT (1)            .31         .13         .24
            Total R²                        .47         .25         .36
            Incremental Validity (2)       1.94        1.08        2.00
            Incremental Validity (3)        .37         .15         .27

NBME II     Sample Size (n)                  81          85         166
            R² non-MCAT                     .05         .07         .05
            R² added by MCAT (1)            .37         .19         .27
            Total R²                        .42         .26         .32
            Incremental Validity (2)       7.40        2.71        5.40
            Incremental Validity (3)        .39         .23         .28
Table 4

Mallows' C_p, Multiple R², Adjusted Multiple R²,
and Cross-Validated Composite Correlations
(r_yy') for Best Subset Regression Analyses
Using New MCAT Predictors

Criterion
Measure              n       C_p      R²     Adj R²    r_yy'

CEF-PS
  Sample 1           84     -1.60     .02     .01       .07
  Sample 2           87      1.80     .15     .11       .04
  All Subjects      171      1.59     .05     .04       --

CEF-IP
  Sample 1           84     -2.13     .10     .08       .06
  Sample 2           87      0.86     .09     .05       .05
  All Subjects      171      1.98     .07     .05       --

NBME I
  Sample 1           92      2.79     .43     .42       .65
  Sample 2           93      2.52     .22     .19       .42
  All Subjects      185      5.13     .34     .33       --

NBME II
  Sample 1           81      0.11     .41     .39       .53
  Sample 2           85      1.87     .20     .18       .41
  All Subjects      166      2.75     .32     .30       --



Table 5

Pearson Correlations Among Preadmission, Old MCAT, Clinical Evaluation,
and NBME I, II Scores for Two Subsamples of Class of 1981
(Sample 1 above the diagonal, Sample 2 below the diagonal)

            Rating    GPA            MCAT                        CEF          NBME
                      S      O      VA     QA     GI     SCI    PS     IP     I      II
              1      2      3      4      5      6      7      8      9      10     11
 1. RATING     --    .16    .13    .15    .27**  .02    .09   -.08   -.27*   .10    .11
 2. GPA-S     .18     --    .59**  .23*   .58**  .15    .51**  .13    .14    .40**  .22
 3. GPA-O     .17    .42**   --    .17    .15    .13    .25*   .13    .18    .22*   .24*
 4. MCAT-VA   .16    .15    .14     --    .38**  .61**  .44** -.09   -.05    .28**  .29**
 5. MCAT-QA   .23*   .46**  .09    .22*    --    .26*   .43** -.08    .01    .26*   .04
 6. MCAT-GI   .15    .08   -.01    .57**  .10     --    .34**  .06    .12    .24*   .37**
 7. MCAT-SCI  .13    .54**  .15    .29**  .52**  .24*    --    .07    .09    .62**  .45**
 8. CEF-PS    .09    .10    .06   -.01    .00   -.04    .13     --    .66**  .17    .23*
 9. CEF-IP    .11    .01    .12    .02    .04   -.10    .08    .61**   --    .00    .04
10. NBME-I    .26*   .45**  .11    .28**  .20    .25*   .55**  .40**  .26*    --    .80**
11. NBME-II   .25*   .28*   .10    .26*   .04    .30**  .15    .30**  .10    .71**   --

Note: RATING = preadmission interview rating; GPA = grade point average, S = science, O = other;
MCAT = Old version, VA = verbal, QA = quantitative, GI = general information, SCI = science;
CEF = clinical evaluation form, PS = problem solving; IP = interpersonal skills.

*p < .05
**p < .01
Table 6

Pearson Correlations Among Preadmission, Old MCAT, Clinical Evaluation,
and NBME I, II Scores for All Students of Class of 1981

            Rating    GPA            MCAT                        CEF          NBME
                      S      O      VA     QA     GI     SCI    PS     IP     I
              1      2      3      4      5      6      7      8      9      10
 1. RATING     --
 2. GPA-S     .17*    --
 3. GPA-O     .15*   .53**   --
 4. MCAT-VA   .15*   .20**  .16*    --
 5. MCAT-QA   .25**  .52**  .12    .30**   --
 6. MCAT-GI   .08    .12    .08    .59**  .18*    --
 7. MCAT-SCI  .11    .52**  .21**  .36**  .48**  .29**   --
 8. CEF-PS    .04    .13    .10   -.04   -.03    .02    .11     --
 9. CEF-IP   -.05    .09    .15   -.01    .03    .02    .09    .65**   --
10. NBME-I    .18*   .45**  .18*   .30**  .29**  .27**  .58**  .27**  .12     --
11. NBME-II   .18*   .25**  .18*   .28**  .04    .34**  .31**  .26**  .07    .76**

Note: RATING = preadmission interview rating; GPA = grade point average, S = science, O = other; MCAT = Old version,
VA = verbal, QA = quantitative, GI = general information, SCI = science; CEF = clinical evaluation
form, PS = problem solving, IP = interpersonal skills.

*p < .05
**p < .01




Table 7

Incremental Validity for Old MCAT
for Class of 1981

Criterion                                                            All
Measure     Statistic                    Sample 1    Sample 2    Subjects

CEF-PS      Sample Size (n)                  76          79         155
            R² non-MCAT                     .03         .04         .02
            R² added by MCAT (1)            .03         .01         .02
            Total R²                        .06         .05         .04
            Incremental Validity (2)       1.00         .25        1.00
            Incremental Validity (3)        .03         .01         .02

CEF-IP      Sample Size (n)                  76          79         155
            R² non-MCAT                     .13         .03         .03
            R² added by MCAT (1)            .02         .02         .01
            Total R²                        .15         .05         .04
            Incremental Validity (2)        [?]         .67         .33
            Incremental Validity (3)        .02         .02         .01

NBME I      Sample Size (n)                  93          88         181
            R² non-MCAT                     .16         .33         .22
            R² added by MCAT (1)            .25         .15         .19
            Total R²                        .41         .48         .41
            Incremental Validity (2)       1.56         .45         .86
            Incremental Validity (3)        .30         .22         .24

NBME II     Sample Size (n)                  78          81         159
            R² non-MCAT                     .07         .14         .09
            R² added by MCAT (1)            .22         .10         .14
            Total R²                        .29         .24         .23
            Incremental Validity (2)       3.14         .71        1.56
            Incremental Validity (3)        .27         .12         .17

[?] = entry illegible in source.
Table 8

Mallows' C_p, Multiple R², Adjusted Multiple R²,
and Cross-Validated Composite Correlations
(r_yy') for Best Subset Regression Analyses
Using Old MCAT Predictors

Criterion
Measure              n       C_p      R²     Adj R²    r_yy'

CEF-PS
  Sample 1           76     -0.46     .02     .00       .13
  Sample 2           79     -2.53     .02     .01       .10
  All Subjects      155     -0.86     .02     .01       --

CEF-IP
  Sample 1           76      0.84     .12     .10       .27
  Sample 2           79     -2.04     .02     .01       .07
  All Subjects      155     -0.86     .02     .02       --

NBME I
  Sample 1           93     -1.38     .40     .39       .57
  Sample 2           88      5.53     .47     .44       .60
  All Subjects      181      4.86     .41     .39       --

NBME II
  Sample 1           78      2.27     .27     .24       .40
  Sample 2           81      2.18     .21     .18       .30
  All Subjects      159      4.85     .23     [?]       --

[?] = entry illegible in source.